PETITION TO REVIVE UNDER 37 C.F.R. § 1.181



Sir: 



Applicant has received the Notice of Abandonment dated January 21, 2004, which was
issued in error. Although the Notice of Abandonment states that no response to the Notice to File
Missing Parts mailed on April 6, 2001 was received, a response to the Notice to File
Missing Parts was in fact timely filed on June 29, 2001. Copies of all the papers filed in response to
the Notice to File Missing Parts on June 29, 2001 are attached, including
a copy of the postcard receipt dated July 3, 2001.

The Petitions Branch is asked to vacate the Notice of Abandonment dated 
January 21, 2004 and to return the above-identified application to the Examining Group for 
examination. The Petitions Branch is also requested to arrange if possible for the docketing 
of this application so that it can be taken up for examination without further delay, in view of 
the early filing date of the application. 

Respectfully submitted, 
WEBB ZIESENHEIM LOGSDON 
ORKIN & HANSON, P.C. 



I hereby certify that this correspondence is being deposited 
with the United States Postal Service as first class mail in an 
envelope addressed to Mail Stop Petition, Commissioner for
Patents, P.O. Box 1450, Alexandria, VA 22313-1450 on 
March 16, 2004. 

Kimberly N. Welday
Date
Barbara E. Johnson
Registration No. 31,198
Attorney for Applicant 
700 Koppers Building 
436 Seventh Avenue 
Pittsburgh, Pennsylvania 15219-1818 
Telephone: 412-471-8815 
Facsimile: 412-471-4094

United States Patent and Trademark Office

UNITED STATES DEPARTMENT OF COMMERCE
United States Patent and Trademark Office

Address: COMMISSIONER FOR PATENTS
P.O. Box 1450
Alexandria, Virginia 22313-1450
www.uspto.gov

APPLICATION NUMBER | FILING OR 371(C) DATE | FIRST NAMED APPLICANT | ATTY. DOCKET NO./TITLE

09/788,269 | 02/16/2001 | Jonathan W. Jarvik | 2087-010261

CONFIRMATION NO. 5283 
ABANDONMENT/TERMINATION 
LETTER 




Date Mailed: 01/21/2004 



NOTICE OF ABANDONMENT UNDER 37 CFR 1.53 (f) OR (g) 

The above-identified application is abandoned for failure to timely or properly reply to the Notice to File Missing 
Parts (Notice) mailed on 04/06/2001. 

• No reply was received. 

A petition to the Commissioner under 37 CFR 1.137 may be filed requesting that the application be revived.

Under 37 CFR 1.137(a), a petition requesting the application be revived on the grounds of UNAVOIDABLE
DELAY must be filed promptly after the applicant becomes aware of the abandonment and such petition must be
accompanied by: (1) an adequate showing of the cause of unavoidable delay; (2) the required reply to the
above-identified Notice; (3) the petition fee set forth in 37 CFR 1.17(l); and (4) a terminal disclaimer if required by 37
CFR 1.137(d).

Under 37 CFR 1.137(b), a petition requesting the application be revived on the grounds of UNINTENTIONAL
DELAY must be filed promptly after applicant becomes aware of the abandonment and such petition must be
accompanied by: (1) a statement that the entire delay was unintentional; (2) the required reply to the
above-identified Notice; (3) the petition fee set forth in 37 CFR 1.17(m); and (4) a terminal disclaimer if required by 37
CFR 1.137(d).

Any questions concerning petitions to revive should be directed to the "Office of Petitions" at (703) 305-9282.
Petitions should be mailed to: Mail Stop Petitions, Commissioner for Patents, P.O. Box 1450, Alexandria VA
22313-1450.




Barbara E. Johnson 

WEBB ZIESENHEIM LOGSDON ORKIN & HANS 
700 Koppers Building 
436 Seventh Avenue 
Pittsburgh, PA 15219-1818 




A copy of this notice MUST be returned with the reply. 




Customer Service Center 
Initial Patent Examination Division (703) 308-1202

indication that the accompanying paper

Sequence Amendment (21 pp.) 
Sequence Listing (5 pp.)
Applicant(s) Jonathan W. JARVIK

PATENT APPLICATION

Serial No. 09/788,269 
Atty. Docket No. 2087-010261 

IN THE UNITED STATES PATENT AND TRADEMARK OFFICE 

Group Art Unit 1645

In re application of:

Jonathan W. JARVIK
Serial No. 09/788,269
Filed February 16, 2001
Examiner Not Yet Assigned

METHODS AND PRODUCTS FOR PEPTIDE-BASED cDNA CHARACTERIZATION AND ANALYSIS

Pittsburgh, Pennsylvania 
June 29, 2001 

SEQUENCE AMENDMENT 

Box SEQUENCE 
Commissioner for Patents 
Washington, DC 20231 
Sir: 

Pursuant to 37 C.F.R. § 1.821 et seq., Applicant submits the following Sequence 
Listing and corresponding computer readable form (CRF) for insertion into the specification. A 
copy of the Notice to File Missing Parts of Nonprovisional Application is also enclosed.
IN THE SPECIFICATION : 

Please insert the attached Sequence Listing into the above-identified patent 

application. 




Copy 



I hereby certify that this correspondence is being deposited with the 
United States Postal Service as First Class Mail in an envelope 
addressed to Commissioner for Patents, Washington, D.C. 20231 on 
June 29, 2001. 

Barbara E. Johnson, Registration No. 31,198




(Name of Registered Representative) 



06/29/2001



Date 



On pages 7-8, please delete paragraph 0018, and insert the following replacement 
paragraph 0018. Pursuant to 37 C.F.R. § 1.121, the following is a clean copy of the replacement 
paragraph. A marked-up copy of the replacement paragraph is attached on separate sheets. 

[0018] If the nucleotide sequence is random, the probability that a sequence of 
given length translated from it will have a particular amino acid sequence can be calculated simply 
by multiplying together the frequencies in the genetic code of the codons encoding each amino acid 
in the sequence. Since some amino acids have as many as six codons and others as few as one, the 
predicted frequency will vary depending on the amino acid sequence itself. Thus the sequence 
LRRLLR (SEQ ID NO: 1), made up entirely of six-codon amino acids, will appear at a frequency
of 1 in (6/61)^6, or approximately once in a million codons, and the sequence MWWMMW (SEQ
ID NO: 2), made up entirely of one-codon amino acids, will appear at a frequency of 1 in (1/61)^6,

or approximately once in fifty billion codons. The frequencies of other sequences will fall between 
these two extremes. The important point for us is that even a relatively short sequence will appear 
very rarely, and so if we can determine the amino acid sequence of a peptide translated from 
unknown sequence, we can match it to a portion of the reference sequence with high specificity. 
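
The arithmetic in paragraph 0018 can be checked with a short sketch. The codon counts below are those of the standard genetic code with its 61 sense codons, which is an assumption here because the paragraph does not list them; the name `peptide_frequency` is illustrative only and does not come from the application.

```python
from fractions import Fraction

# Sense codons per amino acid in the standard genetic code (assumed; 61 in total).
CODON_COUNTS = {
    "L": 6, "R": 6, "S": 6,
    "A": 4, "G": 4, "P": 4, "T": 4, "V": 4,
    "I": 3,
    "C": 2, "D": 2, "E": 2, "F": 2, "H": 2, "K": 2, "N": 2, "Q": 2, "Y": 2,
    "M": 1, "W": 1,
}


def peptide_frequency(peptide: str) -> Fraction:
    """Probability that a run of random sense codons encodes `peptide`."""
    freq = Fraction(1)
    for residue in peptide:
        # Multiply together the per-residue codon frequencies, as the paragraph describes.
        freq *= Fraction(CODON_COUNTS[residue], 61)
    return freq


if __name__ == "__main__":
    for pep in ("LRRLLR", "MWWMMW"):
        print(pep, f"about 1 in {float(1 / peptide_frequency(pep)):.3g}")
```

Under these assumptions the six-codon peptide comes out at roughly one in a million and the one-codon peptide at roughly one in fifty billion, in line with the figures stated in the paragraph.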

On page 15, please delete paragraph 0030, and insert the following replacement 

paragraph 0030. Pursuant to 37 C.F.R. § 1.121, the following is a clean copy of the replacement 

paragraph. A marked-up copy of the replacement paragraph is attached on separate sheets. 

[0030] Comparison of the experimental results with the values in the table
reveals a match to the predicted mass value for one of the ten candidates - specifically
the sequence that begins at position 3190 of the reference sequence and proceeds from right to
left. Retrieval of the reference sequence beginning at position 3190 indicates that the cloned
sequence begins with "GAATTCTTACACCTCATACTTTCCCAAGCCCCAACTTTCTCATCT
GAAAATGGTAATAGTATCATCCTTACATGTTTAAGGTCATGAATTGCTAT
GTGTA" (1st 100 nucleotides shown) (SEQ ID NO: 3). The identification is confirmed by
dideoxy sequencing from a primer 150 nucleotides upstream of the junction between the pUC19
sequence and the EcoRI fragment.

On page 15, please delete paragraph 0032, and insert the following replacement 
paragraph 0032. Pursuant to 37 C.F.R. § 1.121, the following is a clean copy of the replacement
paragraph. A marked-up copy of the replacement paragraph is attached on separate sheets.

[0032] The peptide TMITPSLHACRSTLED (SEQ ID NO: 4), representing the
N-terminal 16 amino acids of the alpha-complementing factor of beta-galactosidase encoded in 
pUC19 (and also representing the 16 constant N-terminal amino acids in all of the peptides 
described in Example 1 above) is used to raise a polyclonal rabbit antibody using standard 
procedures. 

On page 16, please delete paragraph 0034, and insert the following replacement 

paragraph 0034. Pursuant to 37 C.F.R. § 1.121, the following is a clean copy of the replacement 

paragraph. A marked-up copy of the replacement paragraph is attached on separate sheets. 

[0034] The mass spectrum of the immunoprecipitate from the induced cell 
lysate of the clone under examination is observed to contain a distinct peak, at a position 
corresponding to a mass of 8485±3 Daltons, that is not observed in the control. Comparison of 
the experimental results with the values in the table in example 1 above indicates that the insert 
begins at position 9241 of the reference sequence and proceeds from left to right in the Genbank 
sequence. Retrieval of the reference sequence beginning at position 9241 indicates that the 
cloned sequence begins with 

GAATTCACATAAATCGCAAATTTTTTTTTCCTTCCCAGAGCC 
ATCCAAAACTCTGTTTGTCAAAGGCCTGTCTGAGGATACCACTGAAGAGA 



CATTAAAG (1st 100 nucleotides shown) (SEQ ID NO: 5). The identification is confirmed by
dideoxy sequencing as described in Example 1.



On page 17, please delete paragraph 0037, and insert the following replacement 

paragraph 0037. Pursuant to 37 C.F.R. § 1.121, the following is a clean copy of the replacement 

paragraph. A marked-up copy of the replacement paragraph is attached on separate sheets. 

[0037] To identify the nucleotide sequence adjacent to the pTriplEx vector, each
EcoRI site in the J05584 sequence is identified and ligated, in silico, to the EcoRI site in the
pTriplEx vector. For each such in silico construct, the amino acid sequences of the two expected
hybrid translation products (from each of the start codons in the vector to the first in frame stop
codons encountered in the insert) are calculated. The mass of each peptide is calculated and all
10 peptide pairs are tabulated, as shown in the table below. Comparison of the experimental
results (i.e., peptides of 4255 and 2635 Da.) with the values predicted in the table indicates that
the insert begins at position 4028 of the reference sequence and proceeds in the forward direction.
It is concluded that the 5' end of the sequence joined to the vector is
GAATTCTCTTGGGTTTTGTGGTGTGCTAGACTTAATTACCCATGAATGATTT
TGTCCTCTTGAGAAAATTTCAATAGCACATCTATTAGTGTTTTTTAT.... (1st 100
nucleotides shown) (SEQ ID NO: 6). The identification is confirmed by dideoxy sequencing
from the plasmid using a primer 150 nucleotides 3' to the pTriplEx EcoRI site.



Position of EcoRI site   Orientation in pTriplEx   Start Codon   Predicted Peptide Mass
3190                     forward                   1st           6137
3190                     forward                   2nd           5707
3190                     reverse                   1st           6278
3190                     reverse                   2nd           3891
4208                     forward                   1st           4255
4208                     forward                   2nd           2635
4208                     reverse                   1st           19748
4208                     reverse                   2nd           3905
6066                     forward                   1st           3595
6066                     forward                   2nd           3606
6066                     reverse                   1st           6401
6066                     reverse                   2nd           1363
9241                     forward                   1st           3583
9241                     forward                   2nd           7122
9241                     reverse                   1st           4582
9241                     reverse                   2nd           1746
9543                     forward                   1st           5306
9543                     forward                   2nd           1477
9543                     reverse                   1st           9906
9543                     reverse                   2nd           2516

The mass values above are computed by translating each hypothetical fusion 

polypeptide without the N-terminal methionine that is removed in vivo in E. coli. 
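
The comparison step described in paragraph 0037 amounts to a table lookup, which might be sketched as follows. The predicted masses are transcribed from the table above; the function name `match_construct` and the 3 Da tolerance are hypothetical choices made for illustration, since the paragraph does not state a matching tolerance.

```python
# Predicted masses (Da) transcribed from the table above:
# (EcoRI site position, orientation) -> (mass from 1st start codon, mass from 2nd start codon)
PREDICTED = {
    (3190, "forward"): (6137, 5707),
    (3190, "reverse"): (6278, 3891),
    (4208, "forward"): (4255, 2635),
    (4208, "reverse"): (19748, 3905),
    (6066, "forward"): (3595, 3606),
    (6066, "reverse"): (6401, 1363),
    (9241, "forward"): (3583, 7122),
    (9241, "reverse"): (4582, 1746),
    (9543, "forward"): (5306, 1477),
    (9543, "reverse"): (9906, 2516),
}


def match_construct(observed, tolerance=3.0):
    """Return constructs whose two predicted masses each lie within
    `tolerance` Da (an assumed value) of some observed mass."""
    hits = []
    for construct, predicted in PREDICTED.items():
        if all(any(abs(p - o) <= tolerance for o in observed) for p in predicted):
            hits.append(construct)
    return hits


if __name__ == "__main__":
    print(match_construct([4255, 2635]))
```

With the observed peptides of 4255 and 2635 Da, only one row pair of the table satisfies both masses, which is why a pair of peptide masses is enough to single out one site and orientation.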

On page 19, please delete the paragraph 0040, and insert the following replacement 
paragraph 0040. Pursuant to 37 C.F.R. § 1.121, the following is a clean copy of the replacement 
paragraph. A marked-up copy of the replacement paragraph is attached on separate sheets. 

[0040] Two oligonucleotide primers are synthesized using standard methods. In
one, CCCGAATTCAGCAGGTAAAAATCAAGG (SEQ ID NO: 7), the first 10 nucleotides
contain an EcoRI site (underlined) and the last 17 nucleotides correspond to the first 17 nucleotides
of exon 2 of the human nucleolin gene. In the other,
GGGGAATTCTTACTCTTCTCCACTGCTAT (SEQ ID NO: 8), the last 17 nucleotides
correspond to the reverse complement of the last 17 nucleotides of exon 2, followed immediately
(in the sense orientation of the oligonucleotide) by the stop codon TAA and a sequence that
includes an EcoRI site (underlined).
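
The layout of the two primers in paragraph 0040 can be checked mechanically. The sketch below uses the two primer sequences as given in SEQ ID NO: 7 and SEQ ID NO: 8; the helper names `reverse_complement` and `site_position` are illustrative and are not part of the application.

```python
ECORI_SITE = "GAATTC"
COMPLEMENT = str.maketrans("ACGT", "TGCA")

PRIMER_7 = "CCCGAATTCAGCAGGTAAAAATCAAGG"    # SEQ ID NO: 7
PRIMER_8 = "GGGGAATTCTTACTCTTCTCCACTGCTAT"  # SEQ ID NO: 8


def reverse_complement(seq: str) -> str:
    """Complement each base, then reverse the strand."""
    return seq.translate(COMPLEMENT)[::-1]


def site_position(primer: str, site: str = ECORI_SITE) -> int:
    """1-based position of the first occurrence of `site`, or 0 if absent."""
    return primer.find(site) + 1


if __name__ == "__main__":
    sense_8 = reverse_complement(PRIMER_8)
    print("EcoRI site starts at:", site_position(PRIMER_7), site_position(PRIMER_8))
    print("exon-derived 3' end of primer 7:", PRIMER_7[-17:])
    print("sense view of primer 8:", sense_8[:17], sense_8[17:20], sense_8[20:26])
```

Read in the sense orientation, the second primer shows 17 exon-derived nucleotides, then the stop codon, then the EcoRI site, matching the order described in the paragraph.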
On pages 21-24, please delete paragraph 0044, and insert the following
replacement paragraph 0044. Pursuant to 37 C.F.R. § 1.121, the following is a clean copy of the 
replacement paragraph. A marked-up copy of the replacement paragraph is attached on separate 
sheets. 

[0044] The program was run with the 24 nucleotide input sequence
CAACTAGAAGAGGTAAGAAACTAT (SEQ ID NO: 9). Two reading frames were selected:
the forward reading frame beginning with the first nucleotide (F1) and the reverse (antisense)
reading frame beginning with the second antisense nucleotide (R2). The results are shown below.

[begin]

Enter Sequence:

[input] CAACTAGAAGAGGTAAGAAACTAT (SEQ ID NO: 9)

[output] Protein: QLEEVRNY (SEQ ID NO: 10)

Which reading frames would you like to examine?

1: Forward (F1)
2: Forward; first base removed (F2)
3: Forward; second base removed (F3)
4: Reverse (R1)
5: Reverse; first base removed (R2)
6: Reverse; second base removed (R3)
[input] 1,5
[output] MASS DIFFERENCES

Location Mutation Frame F1 Frame R2

None 1032.13 722.89 
/A(K) 0.04 0.00 

1 C-{G(E) 0.99 0.00 

\T(Z) -1032.13 0.00 
/G(R) 28.06 0.00 

2 (Q)A-{T(L) -14.97 0.00 

\C(P) -31.01 0.00
/G(Q) 0.00 0.00
3 A-{T(H) 9.01 0.00
\C(H) 9.01 0.00
/A(I) 0.00 276.34
4 C-{G(V) -14.03 276.34
\T(L) 0.00 0.00
/C(P) -16.04 299.37
5 (L)T-{A(Q) 14.97 226.32
\G(R) 43.03 200.24
/G(L) 0.00 241.29
6 A-{T(L) 0.00 241.33
\C(L) 0.00 242.28



/T(Z) -790.84 -34.02
7 G-{C(Q) -0.99 -34.02
\A(K) -0.95 0.00
/G(G) -72.07 -60.10
8 (E)A-{T(V) -29.99 16.00
\C(A) -58.04 -44.04
/G(E) 0.00 -34.02
9 A-{T(D) -14.03 -34.02
\C(D) -14.03 -48.05
/T(Z) -661.72 0.00
10 G-{C(Q) -0.99 0.00
\A(K) -0.95 0.00
/G(G) -72.07 -16.04
11 (E)A-{T(V) -29.99 23.98
\C(A) -58.04 43.03
/T(D) -14.03 0.00
12 G-{C(D) -14.03 -14.03
\A(E) 0.00 34.02

/T(L) 14.03 -423.52 

13 G-{C(L) 14.03 -423.52 
\A(I) 14.03 0.00 
/C(A) -28.05 -60.04 

14 (V)T-{A(E) 29.99 -16.00 

\G(G) -42.08 -76.10 

/G(V) 0.00 -26.04 

15 A-{T(V) 0.00 -49.08 
\C(V) 0.00 -48.09 

/G(G) -99.14 0.00 

16 A-{T(Z) -433.47 0.00 
\C(R) 0.00 0.00 
/T(I) -43.03 76.10 

17 (R)G-{C(T) -55.09 16.06 

\A(K) -28.02 60.10 

/G(R) 0.00 10.04 

18 A-{T(S) -69.11 14.02 
\C(S) -69.11 -16.00 

/G(D) 0.99 0.00 

19 A-{T(Y) 49.08 0.00 
\C(H) 23.04 0.00 
/G(S) -27.02 -28.05 

20 (N)A-{T(I) -0.94 15.96 

\C(T) -13.00 -42.08 

/A(K) 14.07 48.05 

21 C-{G(K) 14.07 14.03 
\T(N) 0.00 14.03
/C(H) -26.04 18.03

\G(D) -49.08 0.00 

22 T-{A(N) -48.09 0.00 
/G(C) -60.04 -12.06 

23 (Y)A-{T(F) -16.00 15.01 

\C(S) -76.10 43.03 

/C(Y) 0.00 -14.03 

24 T-{A(Z) -163.18 0.00 
\G(Z) -163.18 0.00 

Enter the detection threshold: 

[input] 0.8 Dalton.

[output] Undetectable amino acid substitutions: 1.(Q)C-A(K)
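
The Frame F1 column of the output above can be approximated with a short sketch. The program itself is not reproduced in the paragraph, so everything below is an assumption: the standard genetic code, commonly tabulated average residue masses, a peptide mass taken as the plain sum of residue masses, and translation that stops at the first stop codon. The function names are hypothetical. Only the forward frame starting at the first nucleotide is handled.

```python
from itertools import product

# Standard genetic code in TCAG order; "*" marks a stop codon (assumed table).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Average residue masses in Daltons (assumed values, not taken from the application).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}


def translate(dna: str) -> str:
    """Translate from the first nucleotide, stopping at the first stop codon."""
    peptide = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        peptide.append(aa)
    return "".join(peptide)


def residue_mass(peptide: str) -> float:
    """Sum of residue masses (no terminal water added)."""
    return sum(RESIDUE_MASS[aa] for aa in peptide)


def mass_difference(dna: str, position: int, new_base: str) -> float:
    """Mass change in frame F1 when the base at 1-based `position` becomes `new_base`."""
    mutant = dna[:position - 1] + new_base + dna[position:]
    return residue_mass(translate(mutant)) - residue_mass(translate(dna))


if __name__ == "__main__":
    seq = "CAACTAGAAGAGGTAAGAAACTAT"  # SEQ ID NO: 9
    print(translate(seq), round(residue_mass(seq and translate(seq)), 2))
    print(round(mass_difference(seq, 1, "A"), 2))
```

Under these assumptions the first-position C to A change (Q to K) shifts the mass by only a few hundredths of a Dalton, well below the 0.8 Dalton threshold entered above, while a change that creates a stop codon removes the mass of the whole downstream peptide.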



On page 26, please delete paragraph 0048, and insert the following replacement 

paragraph 0048. Pursuant to 37 C.F.R. § 1.121, the following is a clean copy of the replacement 

paragraph. A marked-up copy of the replacement paragraph is attached on separate sheets. 

[0048] The sequence of exon 2 of the human rds/peripherin gene (Genbank 
accession M73531) is shown below. Intron sequence is shown in lower case; exon sequence in 
upper case. 

gggaagcccatctccagctgtctgtttccctttaagTCGAATCAAGAGCAACGTGGATGGGCG 
GTACCTGGTGGACGGCGTCCCTTTCAGCTGCTGCAATCCTAGCTCGCCACGGCCCTGC 
ATCCAGTATCAGATCACCAACAACTCAGCACACTACAGTTACGACCACCAGACGGAG 
GAGCTCAACCTGTGGGTGCGTGGCTGCAGGGCTGCCCTGCTGAGCTACTACAGCAGCC 
TCATGAACTCCATGGGTGTCGTCACGCTCCTCATTTGGCTCTTCGAGgtaggccctgggcagctg 
ggggtagagggtaaggagagcctcc (SEQ ID NO: 11)

On pages 26-27, please delete paragraph 0049, and insert the following
replacement paragraph 0049. Pursuant to 37 C.F.R. § 1.121, the following is a clean copy of the 
replacement paragraph. A marked-up copy of the replacement paragraph is attached on separate 
sheets. 

[0049] Two primers, of sequences 

GGCCCGGAATTCTCCAGCTGTCTGTTTCCCTTTAAG (SEQ ID NO: 12) and
AATTTACTCGAGCTACCCCCAGCTGCCCAGGGCCTAC (SEQ ID NO: 13), were synthesized and
used to PCR amplify rds/peripherin exon 2 from an individual known to carry a wild type allele of
rds/peripherin. The amplicon was cut with EcoRI and XhoI and cloned into the EcoRI/XhoI sites of the
pGEX derivative described in Nelson et al. The resulting plasmid was cut with XhoI, treated with
Klenow fragment of DNA polymerase, and self-ligated to produce a construct expected to produce a
fusion protein with the sequence shown below.

MSPILGYWKIKGLVQPTRLLLEYLEEKYEEHLYERDEGDKWRNKKFELGLE
FPNLPYYIDGDVKLTQSMAIIRYIADKHNMLGGCPKERAEISMLEGAVLDIRYGVSRIAYSK
DFETLKVDFLSKLPEMLKMFEDRLCHKTYLNGDHVTHPDFMLYDALDVVLYMDPMCLD
AFPKLVCFKKRIEAIPQIDKYLKSSKYIAWPLQGWQATFGGGDHPPKSDLIEGRGIQDLVPH
TTPHHTTPHHTTPHHTTPQDLNSPAVCFPLSRIKSNVDGRYLVDGVPFSCCNPSSPRPCIQY
QITNNSAHYSYDHQTEELNLWVRGCRAALLSYYSSLMNSMGVVTLLIWLFEVGPGQLGV
ARSSGRTVTD (SEQ ID NO: 14)



On page 27, please delete paragraph 0053, and insert the following replacement 
paragraph 0053. Pursuant to 37 C.F.R. § 1.121, the following is a clean copy of the replacement 
paragraph. A marked-up copy of the replacement paragraph is attached on separate sheets. 

[0053] The amplicons described in the previous example are reamplified using
the upstream primer
5'GGATCCTAATACGACTCACTATAGGGAGACCACCATGCATCACCATCATCACCATCA
CCACTCTCCAGCTGTCTGTTTCCCTTTAAG (SEQ ID NO: 15) and the downstream primer
5'CTTAGTCATTATACCCCCAGCTGCCCAGGGCCTAC (SEQ ID NO: 16). The upstream
primer contains a T7 promoter followed by a translation initiation sequence (start codon
underlined) followed by a sequence encoding eight histidines followed by sequence identical to
the rds/peripherin sequence immediately 5' to rds/peripherin exon 2. The downstream primer
contains two stop codons (in antisense orientation) preceding the sequence complementary to the
sequence just 3' to rds/peripherin exon 2.

On page 30, please delete paragraph 0061, and insert the following replacement 
paragraph 0061. Pursuant to 37 C.F.R. § 1.121, the following is a clean copy of the replacement 
paragraph. A marked-up copy of the replacement paragraph is attached on separate sheets. 

[0061] Because the primers are all anchored by non-T nucleotides at their 3' 
ends, only three of them will prime a given cDNA sequence. In the case of the hemoglobin alpha 
2 transcript, which ends in the sequence GCGGCAAAAAAAAAAAAAAAAAAAAAAA...
(SEQ ID NO: 17), the primers that are extended are those ending in G.

REMARKS

Pursuant to the requirements of 37 C.F.R. §§ 1.821-1.825, Applicant submits the 
enclosed Sequence Listing and computer readable form (CRF). The amino acid sequences
disclosed in the specification and drawings may be found in computer readable form in file 
010261.txt on the enclosed diskette and are presented in the paper copy of the Sequence Listing, 
enclosed. 

Applicant hereby certifies that the information recorded in computer readable 
form (CRF) supplied on the enclosed diskette as file 010261.txt is identical to the written 
Sequence Listing. The material presented in computer readable form is not new matter because 
it presents sequences the same as those disclosed in the specification, as filed. 

Applicant believes that the requirements of 37 C.F.R. §§ 1.821-1.825 have been 

met. 

Respectfully submitted, 

WEBB ZIESENHEIM LOGSDON 
ORKIN & HANSON, P.C.

Barbara E. Johnson
Registration No. 31,198 
Attorney for Applicant 
700 Koppers Building 
436 Seventh Avenue 
Pittsburgh, PA 15219-1818 
Telephone: 412-471-8815 
Facsimile: 412-471-4094

MARKED-UP AMENDED SPECIFICATION PARAGRAPHS

[0018] If the nucleotide sequence is random, the probability that a sequence of 
given length translated from it will have a particular amino acid sequence can be calculated simply 
by multiplying together the frequencies in the genetic code of the codons encoding each amino acid 
[amino acid] in the sequence. Since some amino acids have as many as six codons and others as 
few as one, the predicted frequency will vary depending on the amino acid sequence itself. Thus 
the sequence LRRLLR (SEQ ID NO: 1), made up entirely of six-codon amino acids, will appear
at a frequency of 1 in (6/61)^6, or approximately once in a million codons, and the sequence
MWWMMW (SEQ ID NO: 2), made up entirely of one-codon amino acids, will appear at a
frequency of 1 in (1/61)^6, or approximately once in fifty billion codons. The frequencies of other

sequences will fall between these two extremes. The important point for us is that even a relatively 
short sequence will appear very rarely, and so if we can determine the amino acid sequence of a 
peptide translated from unknown sequence, we can match it to a portion of the reference sequence 
with high specificity. 

[0030] Comparison of the experimental results with the values in the table 
indicates reveals a match to the predicted mass value for one of the ten candidates - specifically 
the sequence that begins at position 3190 of the reference sequence and proceeds from right to
left. Retrieval of the reference sequence beginning at position 3190 indicates that the cloned
sequence begins with "GAATTCTTACACCTCATACTTTCCCAAGCCCCAACTTTCTCATCT
GAAAATGGTAATAGTATCATCCTTACATGTTTAAGGTCATGAATTGCTAT
GTGTA" (1st 100 nucleotides shown) (SEQ ID NO: 3). The identification is confirmed by
dideoxy sequencing from a primer 150 nucleotides upstream of the junction between the pUC19
sequence and the EcoRI fragment.



[0032] The peptide TMITPSLHACRSTLED (SEQ ID NO: 4), representing the
N-terminal 16 amino acids of the alpha-complementing factor of beta-galactosidase encoded in
pUC19 (and also representing the 16 constant N-terminal amino acids in all of the peptides
described in Example 1 above) is used to raise a polyclonal rabbit antibody using standard 
procedures. 

[0034] The mass spectrum of the immunoprecipitate from the induced cell 
lysate of the clone under examination is observed to contain a distinct peak, at a position 
corresponding to a mass of 8485±3 Daltons, that is not observed in the control. Comparison of
the experimental results with the values in the table in example 1 above indicates that the insert 
begins at position 9241 of the reference sequence and proceeds from left to right in the Genbank 
sequence. Retrieval of the reference sequence beginning at position 9241 indicates that the 
cloned sequence begins with 

GAATTCACATAAATCGCAAATTTTTTTTTCCTTCCCAGAGCC
ATCCAAAACTCTGTTTGTCAAAGGCCTGTCTGAGGATACCACTGAAGAGA
CATTAAAG (1st 100 nucleotides shown) (SEQ ID NO: 5). The identification is confirmed by
dideoxy sequencing as described in Example 1.

[0037] To identify the nucleotide sequence adjacent to the pTriplEx vector, each
EcoRI site in the J05584 sequence is identified and ligated, in silico, to the EcoRI site in the
pTriplEx vector. For each such in silico construct, the amino acid sequences of the two expected
hybrid translation products (from each of the start codons in the vector to the first in frame stop
codons encountered in the insert) are calculated. The mass of each peptide is calculated and all
10 peptide pairs are tabulated, as shown in the table below. Comparison of the experimental
results (i.e., peptides of 4255 and 2635 Da.) with the values predicted in the table indicates that
the insert begins at position 4028 of the reference sequence and proceeds in the forward direction.
It is concluded that the 5' end of the sequence joined to the vector is
GAATTCTCTTGGGTTTTGTGGTGTGCTAGACTTAATTACCCATGAATGATTT
TGTCCTCTTGAGAAAATTTCAATAGCACATCTATTAGTGTTTTTTAT.... (1st 100
nucleotides shown) (SEQ ID NO: 6). The identification is confirmed by dideoxy sequencing
from the plasmid using a primer 150 nucleotides 3' to the pTriplEx EcoRI site.



Position of EcoRI site   Orientation in pTriplEx   Start Codon   Predicted Peptide Mass
3190                     forward                   1st           6137
3190                     forward                   2nd           5707
3190                     reverse                   1st           6278
3190                     reverse                   2nd           3891
4208                     forward                   1st           4255
4208                     forward                   2nd           2635
4208                     reverse                   1st           19748
4208                     reverse                   2nd           3905
6066                     forward                   1st           3595
6066                     forward                   2nd           3606
6066                     reverse                   1st           6401
6066                     reverse                   2nd           1363
9241                     forward                   1st           3583
9241                     forward                   2nd           7122
9241                     reverse                   1st           4582
9241                     reverse                   2nd           1746
9543                     forward                   1st           5306
9543                     forward                   2nd           1477
9543                     reverse                   1st           9906
9543                     reverse                   2nd           2516

The mass values above are computed by translating each hypothetical fusion 

polypeptide without the N-terminal methionine that is removed in vivo in E. coli. 



[0040] Two oligonucleotide primers are synthesized using standard methods. In
one, CCCGAATTCAGCAGGTAAAAATCAAGG (SEQ ID NO: 7), the first 10 nucleotides
contain an EcoRI site (underlined) and the last 17 nucleotides correspond to the first 17 nucleotides
of exon 2 of the human nucleolin gene. In the other,
GGGGAATTCTTACTCTTCTCCACTGCTAT (SEQ ID NO: 8), the last 17 nucleotides
correspond to the reverse complement of the last 17 nucleotides of exon 2, followed immediately
(in the sense orientation of the oligonucleotide) by the stop codon TAA and a sequence that
includes an EcoRI site (underlined).

[0044] The program was run with the 24 nucleotide input sequence
CAACTAGAAGAGGTAAGAAACTAT (SEQ ID NO: 9). Two reading frames were selected:
the forward reading frame beginning with the first nucleotide (F1) and the reverse (antisense)
reading frame beginning with the second antisense nucleotide (R2). The results are shown below.

[begin]

Enter Sequence:

[input] CAACTAGAAGAGGTAAGAAACTAT (SEQ ID NO: 9)

[output] Protein: QLEEVRNY (SEQ ID NO: 10)

Which reading frames would you like to examine?

1: Forward (F1)
2: Forward; first base removed (F2)
3: Forward; second base removed (F3)
4: Reverse (R1)
5: Reverse; first base removed (R2)
6: Reverse; second base removed (R3)
[input] 1,5
[output] MASS DIFFERENCES

Location Mutation Frame F1 Frame R2



None 1032.13 722.89 

/A(K) 0.04 0.00 

C-{G(E) 0.99 0.00

\T(Z) -1032.13 0.00

/G(R) 28.06 0.00

(Q)A-{T(L) -14.97 0.00

\C(P) -31.01 0.00

/G(Q) 0.00 0.00 

A-{T(H) 9.01 0.00 

\C(H) 9.01 0.00 

/A(I) 0.00 276.34 

C-{G(V) -14.03 276.34 

\T(L) 0.00 0.00 

/C(P) -16.04 299.37 

(L)T-{A(Q) 14.97 226.32 

\G(R) 43.03 200.24 

/G(L) 0.00 241.29 

A-{T(L) 0.00 241.33 

\C(L) 0.00 242.28 

/T(Z) -790.84 -34.02 

G-{C(Q) -0.99 -34.02 

\A(K) -0.95 0.00 

/G(G) -72.07 -60.10 

(E)A-{T(V) -29.99 16.00 

\C(A) -58.04 -44.04 

/G(E) 0.00 -34.02 

A-{T(D) -14.03 -34.02 

\C(D) -14.03 -48.05 



/T(Z) -661.72 0.00

10 G-{C(Q) -0.99 0.00
\A(K) -0.95 0.00 
/G(G) -72.07 -16.04 

11 (E)A-{T(V) -29.99 23.98 

\C(A) -58.04 43.03 

/T(D) -14.03 0.00 

12 G-{C(D) -14.03 -14.03 
\A(E) 0.00 34.02 

/T(L) 14.03 -423.52

13 G-{C(L) 14.03 -423.52 
\A(I) 14.03 0.00 
/C(A) -28.05 -60.04 

14 (V)T-{A(E) 29.99 -16.00 

\G(G) -42.08 -76.10 

/G(V) 0.00 -26.04 

15 A-{T(V) 0.00 -49.08 
\C(V) 0.00 -48.09

/G(G) -99.14 0.00

16 A-{T(Z) -433.47 0.00

\C(R) 0.00 0.00 

/T(I) -43.03 76.10 

17 (R)G-{C(T) -55.09 16.06 

\A(K) -28.02 60.10 

/G(R) 0.00 10.04 

18 A-{T(S) -69.11 14.02 
\C(S) -69.11 -16.00 

/G(D) 0.99 0.00 

19 A-{T(Y) 49.08 0.00 
\C(H) 23.04 0.00 



/G(S) -27.02 -28.05 

20 (N)A-{T(I) -0.94 15.96

\C(T) -13.00 -42.08 

/A(K) 14.07 48.05 

21 C-{G(K) 14.07 14.03 
\T(N) 0.00 14.03 

/C(H) -26.04 18.03 

\G(D) -49.08 0.00 

22 T-{A(N) -48.09 0.00 
/G(C) -60.04 -12.06 

23 (Y)A-{T(F) -16.00 15.01 

\C(S) -76.10 43.03 

/C(Y) 0.00 -14.03 

24 T-{A(Z) -163.18 0.00 
\G(Z) -163.18 0.00 

Enter the detection threshold: 

[input] 0.8 Dalton.

[output] Undetectable amino acid substitutions: 1.(Q)C-A(K)

[0048] The sequence of exon 2 of the human rds/peripherin gene (Genbank 
accession M73531) is shown below. Intron sequence is shown in lower case; exon sequence in 
upper case. 

gggaagcccatctccagctgtctgtttccctttaagTCGAATCAAGAGCAACGTGGATGGGCG 
GTACCTGGTGGACGGCGTCCCTTTCAGCTGCTGCAATCCTAGCTCGCCACGGCCCTGC 
ATCCAGTATCAGATCACCAACAACTCAGCACACTACAGTTACGACCACCAGACGGAG 
GAGCTCAACCTGTGGGTGCGTGGCTGCAGGGCTGCCCTGCTGAGCTACTACAGCAGCC 
TCATGAACTCCATGGGTGTCGTCACGCTCCTCATTTGGCTCTTCGAGgtaggccctgggcagctg 
ggggtagagggtaaggagagcctcc (SEQ ID NO: 11)



[0049] Two primers, of sequences 

GGCCCGGAATTCTCCAGCTGTCTGTTTCCCTTTAAG (SEQ ID NO: 12) and
AATTTACTCGAGCTACCCCCAGCTGCCCAGGGCCTAC (SEQ ID NO: 13), were synthesized and
used to PCR amplify rds/peripherin exon 2 from an individual known to carry a wild type allele of
rds/peripherin. The amplicon was cut with EcoRI and XhoI and cloned into the EcoRI/XhoI sites of the
pGEX derivative described in Nelson et al. The resulting plasmid was cut with XhoI, treated with
Klenow fragment of DNA polymerase, and self-ligated to produce a construct expected to produce a
fusion protein with the sequence shown below.

MSPILGYWKIKGLVQPTRLLLEYLEEKYEEHLYERDEGDKWRNKKFELGLE
FPNLPYYIDGDVKLTQSMAIIRYIADKHNMLGGCPKERAEISMLEGAVLDIRYGVSRIAYSK
DFETLKVDFLSKLPEMLKMFEDRLCHKTYLNGDHVTHPDFMLYDALDVVLYMDPMCLD
AFPKLVCFKKRIEAIPQIDKYLKSSKYIAWPLQGWQATFGGGDHPPKSDLIEGRGIQDLVPH
TTPHHTTPHHTTPHHTTPQDLNSPAVCFPLSRIKSNVDGRYLVDGVPFSCCNPSSPRPCIQY
QITNNSAHYSYDHQTEELNLWVRGCRAALLSYYSSLMNSMGVVTLLIWLFEVGPGQLGV
ARSSGRTVTD (SEQ ID NO: 14)

[0053] The amplicons described in the previous example are reamplified using 
the upstream primer 

5'GGATCCTAATACGACTCACTATAGGGAGACCACCATGCATCACCATCATCACCATCA 
CCACTCTCCAGCTGTCTGTTTCCCTTTAAG (SEQ ID NO: 15) and the downstream primer
5'CTTAGTCATTATACCCCCAGCTGCCCAGGGCCTAC (SEQ ID NO: 16). The upstream
primer contains a T7 promoter followed by a translation initiation sequence (start codon
underlined) followed by a sequence encoding eight histidines followed by sequence identical to
the rds/peripherin sequence immediately 5' to rds/peripherin exon 2. The downstream primer
contains two stop codons (in antisense orientation) preceding the sequence complementary to the
sequence just 3' to rds/peripherin exon 2.

[0061] Because the primers are all anchored by non-T nucleotides at their 3' 
ends, only three of them will prime a given cDNA sequence. In the case of the hemoglobin
alpha 2 transcript, which ends in the sequence
GCGGCAAAAAAAAAAAAAAAAAAAAAAA... (SEQ ID NO: 17), the primers that are
extended are those ending in G.



SEQUENCE LISTING 




<110> Jarvik, Jonathan W.

<120> Methods and Products for Peptide-Based cDNA
Characterization and Analysis

<130> 2087 010261

<140> US 09/788,269
<141> 2001-02-16

<150> US 60/182,983
<151> 2000-02-16

<160> 17

<170> Microsoft Word 97 SR-2

<210> 1 
<211> 6 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Example of sequence made up entirely of six-codon amino acids 
<400> 1 

Leu Arg Arg Leu Leu Arg
1 5 

<210> 2 
<211> 6 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Example of sequence made up entirely of one-codon amino acids
<400> 2 

Met Trp Trp Met Met Trp 
1 5 

<210> 3
<211> 100
<212> DNA
<213> Homo sapiens
<400> 3

gaattcttac acctcatact ttcccaagcc ccaactttct catctgaaaa tggtaatagt 60

atcatcctta catgtttaag gtcatgaatt gctatgtgta 100

<210> 4
<211> 16
<212> PRT
<213> Homo sapiens
<400> 4

Thr Met Ile Thr Pro Ser Leu His Ala Cys Arg Ser Thr Leu Glu Asp
1               5                   10                  15



<210> 5 
<211> 100 
<212> DNA 

<213> Homo sapiens 
<400> 5

gaattcacat aaatcgcaaa tttttttttc cttcccagag ccatccaaaa ctctgtttgt 60

caaaggcctg tctgaggata ccactgaaga gacattaaag 100

<210> 6
<211> 99
<212> DNA
<213> Homo sapiens
<400> 6

gaattctctt gggttttgtg gtgtgctaga cttaattacc catgaatgat tttgtcctct 60

tgagaaaatt tcaatagcac atctattagt gttttttat 99

<210> 7
<211> 27
<212> DNA
<213> Artificial Sequence
<220>
<221> SITE
<222> (4)..(9)
<223> Oligonucleotide primer containing EcoRI site
<400> 7

cccgaattca gcaggtaaaa atcaagg 27

<210> 8
<211> 29
<212> DNA
<213> Artificial Sequence
<220>
<221> SITE
<222> (4)..(9)
<223> Oligonucleotide primer containing EcoRI site
<400> 8

ggggaattct tactcttctc cactgctat 29

<210> 9
<211> 24
<212> DNA
<213> Artificial Sequence
<220>
<223> Nucleotide input sequence used to demonstrate computer program
capabilities
<400> 9

caactagaag aggtaagaaa ctat 24



<210> 10
<211> 8
<212> PRT
<213> Artificial Sequence
<220>
<223> Computer program output of encoded peptides
<400> 10

Gln Leu Glu Glu Val Arg Asn Tyr
1               5

<210> 11
<211> 326
<212> DNA
<213> Homo sapiens
<220>
<221> exon
<222> (37)..(283)
<400> 11

gggaagccca tctccagctg tctgtttccc tttaagtcga atcaagagca acgtggatgg 60

gcggtacctg gtggacggcg tccctttcag ctgctgcaat cctagctcgc cacggccctg 120

catccagtat cagatcacca acaactcagc acactacagt tacgaccacc agacggagga 180

gctcaacctg tgggtgcgtg gctgcagggc tgccctgctg agctactaca gcagcctcat 240

gaactccatg ggtgtcgtca cgctcctcat ttggctcttc gaggtaggcc ctgggcagct 300

gggggtagag ggtaaggaga gcctcc 326



<210> 12
<211> 36
<212> DNA
<213> Artificial sequence
<220>
<223> Primer synthesized and used to PCR amplify rds/peripherin exon 2
from an individual known to carry a wild type allele of
rds/peripherin.
<400> 12

ggcccggaat tctccagctg tctgtttccc tttaag 36



<210> 13
<211> 37
<212> DNA
<213> Artificial sequence
<220>
<223> Primer synthesized and used to PCR amplify rds/peripherin exon 2
from an individual known to carry a wild type allele of
rds/peripherin.
<400> 13

aatttactcg agctaccccc agctgcccag ggcctac 37

<210> 14
<211> 364
<212> PRT
<213> Artificial sequence
<220>
<223> Fusion protein
<400> 14

Met Ser Pro Ile Leu Gly Tyr Trp Lys Ile Lys Gly Leu Val Gln Pro
1               5                   10                  15

Thr Arg Leu Leu Leu Glu Tyr Leu Glu Glu Lys Tyr Glu Glu His Leu
            20                  25                  30

Tyr Glu Arg Asp Glu Gly Asp Lys Trp Arg Asn Lys Lys Phe Glu Leu
        35                  40                  45

Gly Leu Glu Phe Pro Asn Leu Pro Tyr Tyr Ile Asp Gly Asp Val Lys
    50                  55                  60

Leu Thr Gln Ser Met Ala Ile Ile Arg Tyr Ile Ala Asp Lys His Asn
65                  70                  75                  80

Met Leu Gly Gly Cys Pro Lys Glu Arg Ala Glu Ile Ser Met Leu Glu
                85                  90                  95

Gly Ala Val Leu Asp Ile Arg Tyr Gly Val Ser Arg Ile Ala Tyr Ser
            100                 105                 110

Lys Asp Phe Glu Thr Leu Lys Val Asp Phe Leu Ser Lys Leu Pro Glu
        115                 120                 125

Met Leu Lys Met Phe Glu Asp Arg Leu Cys His Lys Thr Tyr Leu Asn
    130                 135                 140

Gly Asp His Val Thr His Pro Asp Phe Met Leu Tyr Asp Ala Leu Asp
145                 150                 155                 160

Val Val Leu Tyr Met Asp Pro Met Cys Leu Asp Ala Phe Pro Lys Leu
                165                 170                 175

Val Cys Phe Lys Lys Arg Ile Glu Ala Ile Pro Gln Ile Asp Lys Tyr
            180                 185                 190

Leu Lys Ser Ser Lys Tyr Ile Ala Trp Pro Leu Gln Gly Trp Gln Ala
        195                 200                 205

Thr Phe Gly Gly Gly Asp His Pro Pro Lys Ser Asp Leu Ile Glu Gly
    210                 215                 220

Arg Gly Ile Gln Asp Leu Val Pro His Thr Thr Pro His His Thr Thr
225                 230                 235                 240

Pro His His Thr Thr Pro His His Thr Thr Pro Gln Asp Leu Asn Ser
                245                 250                 255

Pro Ala Val Cys Phe Pro Leu Ser Arg Ile Lys Ser Asn Val Asp Gly
            260                 265                 270

Arg Tyr Leu Val Asp Gly Val Pro Phe Ser Cys Cys Asn Pro Ser Ser
        275                 280                 285

Pro Arg Pro Cys Ile Gln Tyr Gln Ile Thr Asn Asn Ser Ala His Tyr
    290                 295                 300

Ser Tyr Asp His Gln Thr Glu Glu Leu Asn Leu Trp Val Arg Gly Cys
305                 310                 315                 320

Arg Ala Ala Leu Leu Ser Tyr Tyr Ser Ser Leu Met Asn Ser Met Gly
                325                 330                 335

Val Val Thr Leu Leu Ile Trp Leu Phe Glu Val Gly Pro Gly Gln Leu
            340                 345                 350

Gly Val Ala Arg Ser Ser Gly Arg Ile Val Thr Asp
        355                 360

<210> 15
<211> 87
<212> DNA
<213> Artificial sequence
<220>
<221> misc_feature
<222> (35)..(37)
<223> Upstream primer used to amplify amplicons
Start codon at 35-37
<400> 15

ggatcctaat acgactcact atagggagac caccatgcat caccatcatc accatcacca 60

ctctccagct gtctgtttcc ctttaag 87



<210> 16 
<211> 35 
<212> DNA

<213> Artificial sequence 
<220> 

<223> Downstream primer used to reamplify amplicons 
<400> 16 

cttagtcatt atacccccag ctgcccaggg cctac 35

<210> 17 
<211> 28 
<212> DNA 

<213> Artificial sequence 
<220> 

<223> Ending of hemoglobin alpha 2 transcript 
<400> 17 

gcggcaaaaa aaaaaaaaaa aaaaaaaa 28